Explore moe #26
base: main
Conversation
…sy compression function to support moe
Copilot reviewed 10 out of 22 changed files in this pull request and generated 7 suggestions.
Files not reviewed (12)
- notebooks/explore.ipynb: Evaluated as low risk
- deltazip/modeling/_base.py: Evaluated as low risk
- deltazip/modeling/__init__.py: Evaluated as low risk
- deltazip/__init__.py: Evaluated as low risk
- deltazip/modeling/auto.py: Evaluated as low risk
- deltazip/modeling/_utils.py: Evaluated as low risk
- deltazip/lossless/compressor.py: Evaluated as low risk
- deltazip/modeling/_const.py: Evaluated as low risk
- deltazip/core/sparsegpt.py: Evaluated as low risk
- notebooks/playground.py: Evaluated as low risk
- deltazip/modeling/gpt_neox_moe.py: Evaluated as low risk
- deltazip/modeling/moe/base_generation_strategies.py: Evaluated as low risk
Comments skipped due to low confidence (4)
cli/chat_moe.py:27
- The `args` object is not defined within the `chat` function. Use `model_path` instead.
```python
delta_model = AutoDeltaZipModelForCausalLM.from_compressed(args.model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True)
```
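A minimal sketch of the fix within `cli/chat_moe.py` (which already imports `AutoDeltaZipModelForCausalLM`), assuming `chat` receives `model_path` as a parameter; the function signature is not shown in this excerpt:

```python
def chat(model_path: str):
    # Use the function's own parameter; `args` exists only in the CLI entry point.
    delta_model = AutoDeltaZipModelForCausalLM.from_compressed(
        model_path,
        strict=True,
        device="cpu",
        unpack=True,
        trust_remote_code=True,
    )
```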
cli/chat_moe.py:49
- The `device` parameter in `TextGenerationPipeline` should be an integer, not a `torch.device` object. Use `device=0` instead.
```python
delta_model.to(torch.device("cuda"))
```
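A sketch of the suggested construction, assuming a `tokenizer` is already loaded alongside `delta_model` in `cli/chat_moe.py`:

```python
import torch
from transformers import TextGenerationPipeline

delta_model.to(torch.device("cuda"))
# Per the comment above: pass the GPU index as an integer, not a torch.device.
pipe = TextGenerationPipeline(model=delta_model, tokenizer=tokenizer, device=0)
```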
cli/compress_moe.py:99
- The assertion should be `assert base_model is None or is_moe, "You can only compress a moe without a base representation."`.
```python
assert base_model is None and is_moe, "You can only compress a moe without a base representation."
```
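The difference between the two connectives, as a standalone truth-table check (`base_model` and `is_moe` mirror the variables in `cli/compress_moe.py`):

```python
# `and` passes only when base_model is None AND the target is a MoE, so any
# non-MoE compression would trip the assertion. `or` fails only in the one
# invalid case: a base representation supplied together with a non-MoE target.
for base_model, is_moe in [(None, True), (None, False), ("base", True), ("base", False)]:
    and_ok = base_model is None and is_moe
    or_ok = base_model is None or is_moe
    print(f"base_model={base_model!r:<8} is_moe={is_moe!s:<5} and->{and_ok!s:<5} or->{or_ok}")
```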
cli/compress_moe.py:105
- [nitpick] The repeated block of code for loading the model should be refactored to avoid duplication.
```python
if args.target_model == "gpt_neox_moe": ... elif args.target_model == "llama_moe": ... else: ...
```
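One way to remove the duplication (a sketch; `load_gpt_neox_moe` and `load_llama_moe` are hypothetical helpers that would absorb the existing branches):

```python
def load_gpt_neox_moe(model_path: str):
    ...  # body of the current "gpt_neox_moe" branch

def load_llama_moe(model_path: str):
    ...  # body of the current "llama_moe" branch

# A dispatch table replaces the if/elif/else chain.
MODEL_LOADERS = {
    "gpt_neox_moe": load_gpt_neox_moe,
    "llama_moe": load_llama_moe,
}

def load_target_model(target_model: str, model_path: str):
    if target_model not in MODEL_LOADERS:
        raise ValueError(f"Unsupported target model: {target_model}")
    return MODEL_LOADERS[target_model](model_path)
```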
```python
delta_model = None
config = None
if model_type == "gpt-neox-moe":
    with open(f"{args.model_path}/base/base_model/config.json", "r") as fp:
```
The variable `args` is used here but `model_path` is the in-scope name; use `model_path` instead of `args.model_path`.

Suggested change:
```diff
- with open(f"{args.model_path}/base/base_model/config.json", "r") as fp:
+ with open(f"{model_path}/base/base_model/config.json", "r") as fp:
```
```python
base_weights = load_file(f"{model_path}/base/base_weights.safetensors")

delta_model = AutoDeltaZipModelForCausalLM.from_compressed(
    args.model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True, model_config=config, custom_model=delta_model
)
```
The variable `args` is used here but `model_path` is the in-scope name; use `model_path` instead of `args.model_path`.

Suggested change:
```diff
- args.model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True, model_config=config, custom_model=delta_model
+ model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True, model_config=config, custom_model=delta_model
```
logger.info("Tokenizer loaded") | ||
|
||
logger.info("Loading base_model") | ||
base_model = transformers.AutoModelForCausalLM.from_pretrained(f"{model_path}/base/base_model.pt", trust_remote_code=True) |
The path should be a directory containing model configurations, not a `.pt` file.

Suggested change:
```diff
- base_model = transformers.AutoModelForCausalLM.from_pretrained(f"{model_path}/base/base_model.pt", trust_remote_code=True)
+ base_model = transformers.AutoModelForCausalLM.from_pretrained(f"{model_path}/base", trust_remote_code=True)
```
print(f"base_weights: {base_weights.keys()}") | ||
|
||
for expert_name, expert_weight in base_weights.items(): | ||
prefix, suffix = expert_name.split(EXPERT_ID_PLACEHOLDER) |
The `split` method might fail if `EXPERT_ID_PLACEHOLDER` is not found in `expert_name`. This should be handled properly.

Suggested change:
```diff
- prefix, suffix = expert_name.split(EXPERT_ID_PLACEHOLDER)
+ if EXPERT_ID_PLACEHOLDER not in expert_name: continue
```
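A defensive variant that keeps the split (a sketch, assuming non-expert entries should simply be skipped):

```python
for expert_name, expert_weight in base_weights.items():
    # Without the placeholder, split() returns a single element and the
    # two-way unpack raises ValueError, so guard first.
    if EXPERT_ID_PLACEHOLDER not in expert_name:
        continue
    # maxsplit=1 keeps the unpack safe even if the placeholder repeats.
    prefix, suffix = expert_name.split(EXPERT_ID_PLACEHOLDER, maxsplit=1)
```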
```python
from deltazip.modeling._const import EXPERT_ID_PLACEHOLDER
from loguru import logger
from safetensors.torch import save_file
import safetensors
```
[nitpick] The `safetensors` module is imported but not used directly. Use it directly or remove the import.
```python
# Make sure we only save the non-fc layers (i.e. the layers where MoE isn't applied)
for name in to_remove:
    del sd[name]
model.save_pretrained(f"{args.outdir}/base/base_model", state_dict=sd)
```
The `save_pretrained` method does not accept a `state_dict` parameter. Set the `state_dict` directly on the model before calling `save_pretrained`.

Suggested change:
```diff
- model.save_pretrained(f"{args.outdir}/base/base_model", state_dict=sd)
+ model.load_state_dict(sd)
```
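Worth verifying against the pinned transformers version: `PreTrainedModel.save_pretrained` does accept a `state_dict` keyword in recent releases. If the `load_state_dict` route is taken anyway, note that keys were deleted from `sd` above, so a strict load would fail; a sketch:

```python
# strict=False because the MoE expert entries were removed from `sd` above.
model.load_state_dict(sd, strict=False)
model.save_pretrained(f"{args.outdir}/base/base_model")
```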
layer_type = "LlamaDecoderLayer" | ||
layers_block_name = "model.layers" | ||
outside_layer_modules = ["model.embed_tokens", "model.norm"] | ||
inside_layer_modules = [f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.up_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.gate_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.down_proj"], |
The `inside_layer_modules` attribute is defined as a tuple due to the trailing comma. It should be a list to avoid potential issues.

Suggested change:
```diff
- inside_layer_modules = [f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.up_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.gate_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.down_proj"],
+ inside_layer_modules = [f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.up_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.gate_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.down_proj"]
```
No description provided.